INTRODUCTION

There  has  been  much  work  recently  to  compose  computer  models  of 
the  cognitive  processes  that  govern  the  thinking  and  perception  of  humans 
(see  Anderson  1983;  Marr  1982;  Newell,  1980;  Pinker,  1985).  Many  of 
these  models  are  based  on  empirical  relationships  derived  from  analysis  of 
data  taken  from  human  subjects  in  experimental  tasks.  Such  tasks  are 
employed  to  gain  understanding  of  human  cognitive  events.  To  the  extent 
that these events may be simulated via artificial intelligence techniques (in
particular, machine learning), one can conclude that the reality of cognition
is  indeed  operable.  Although  computer  models  have  usually  been  proposed 
for  problem  solving  and  perception,  there  has  been  a  dearth  of  application 
to  the  areas  of  attention  and  processing  between  the  two  cerebral 
hemispheres  of  the  brain.  Hence,  the  intent  of  this  report  is  to  propose  a 
machine  learning  model  that  categorizes  patterns  in  accordance  with 
properties  of  attention  and  cerebral  laterality.  The  domain  of  face 
recognition  will  be  used  to  demonstrate  human  and  machine  learning.  The 
content  of  this  model  will  be  a  summarization  of  results  collected  from 
subjects  in  a  series  of  match-to-sample  recognition  tasks  (see  Katsuyama, 
McNeese,  &  Schertler,  1987;  Katsuyama  &  McNeese,  1987;  McNeese  & 
Katsuyama,  1987).  The  machine  learning  mechanism  will  be  proposed  as  a 
successive  evolution  of  categorization  programs.  The  initial  implementation 
will  be  solely  based  on  Quinlan's  (1986)  ID3  algorithm. 

A basic question to address is why one would select machine learning
programs at all, since they might seem unnecessary for modeling the processes
referred to. The answer involves the nature of the phenomenon to be modeled.
The data collected yield results of a highly dynamic nature. Humans
employ different strategies for recognizing faces as a function of the
cognitive demands of a task as well as the familiarity subsequently acquired
with a given face. So, one of the reasons to employ machine learning is simply
to see whether simulated mechanisms of learning can model human learning.
In this way, one can ascertain a goodness-of-fit between the algorithm and the
actual learning observed in human subjects. Classification programs, and
transitions toward conceptual clustering programs, are specifically proposed
because they tend to coincide with the theoretical explanations of what a
human might be doing to recognize a pattern (e.g., a face).
In particular, Quinlan (1986) points out that knowledge-based expert
systems define learning as the acquisition of structured knowledge in the
form of concepts, nets, or rules. Machine learning has come to be an
important adjunct to these types of systems, as it can be used to support
knowledge acquisition without expert intervention. The use of examples to
induce decision trees by the ID3 program will be reviewed in the next
section. It is in this context that the CEREBRAL WEEVIL model will grow.

Many of the results in the neuropsychology literature posit that a
person comes to classify a face (in this case the face is seen as the "object")
by different strategies which seem to be associated with either the left or
right cerebral hemisphere. Thus, when a subject is presented a face to
recognize, classification may occur by (1) piecemeal processes or (2)
configurational processes. Each of these processes is initiated by different
conditions or attributes which are perceived by the person. Classifications
based on piecemeal recognition usually occur in the right hemisphere and
focus upon finding specific features for recognition (e.g., a nose), whereas
configurational classifications rely upon the person constructing a prototype
schema (e.g., distances between the eyes, nose, and mouth). There surely is
a  similarity  between  object  classification  learning  in  programs  such  as  ID3 
and  object  classification  using  the  advantages  inherent  in  each  cerebral 
hemisphere. 

The  goal  of  this  project  is  to  emulate  the  cognitive  processes  that  allow 
a person to perform most efficiently on a specific task. At this point
it might prove beneficial to think of the machine learning program as a
replacement for the human in the experimental task. Although the task will
have to be implemented within the constraints of the representation
language, and the elements of repetition within the tasks may have to be
reduced, the program used should emulate the task itself. The
inputs and outputs of the task will be described in a later section,
but  first  the  idea  of  efficient  performance  by  the  human  needs  to  be 
addressed  more  precisely. 

One  of  the  main  dynamic  components  of  efficiency  on  cognitive  tasks 
is  the  amount  of  attentional  resources  a  person  has  available  for 
expenditure. If the available resources are too low for single or dual tasks,
then performance may suffer. Friedman & Polson (1981) propose that each
cerebral hemisphere provides the human with a separate, equivalent pool of
attentional resources, yet each pool is not accessible by the other. This is
one of the first hypotheses to connect attention, hemispheric asymmetries,
and performance into a coherent framework. Some of the theoretical
predictions in hemispheric performance can be derived, in part, from this
theory. Attention is mentioned only to introduce the attribute of resources,
which becomes important for a dynamic machine learning system to
capture. Note too that the more evolved machine learning
programs may be better suited to handling this greater complexity.
RELATED WORK


The  basis  for  the  machine  learning  model  is  ID3  (Quinlan,  1986)  which 
is  a  nonincremental  system  that  searches  for  patterns  in  examples  after 
successive  iterations  of  different  training  sets.  This  can  be  identified  as  a 
data-driven  approach  to  learning  by  example.  The  BACON  (Langley, 
Bradshaw,  &  Simon,  1983)  and  the  INDUCE  (Michalski,  1980)  programs  also 
use  data-driven  approaches  to  classification.  These  types  of  programs  were 
selected  as  they  share  common  mechanisms  with  the  proposed  human 
pattern  recognition  system  as  derived  from  the  target  experimental  task 
and procedure (Katsuyama, McNeese, & Schertler, 1987). The ID3
program induces a decision tree for classifying objects based upon
particular values of the attributes which identify these objects. The trees
themselves are composed of these very attributes. CEREBRAL WEEVIL
is initially formulated as an extended version of the ID3 algorithm,
hereafter referred to as ID3-CPO (Conceptually Pleasant Overextension).
However,  one  of  the  major  issues  is  the  extent  to  which  a  nonincremental 
approach  is  advantageous. 
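As an illustration of the induction step described above, the following is a minimal sketch of ID3-style tree building using entropy and information gain. It is not the MacSMARTS implementation; the function names, the toy examples, and the class labels `class_a` and `class_b` are hypothetical and are not taken from the experimental data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(examples, attribute, target):
    """Expected reduction in entropy from splitting on `attribute`."""
    base = entropy([ex[target] for ex in examples])
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

def id3(examples, attributes, target):
    """Return a nested-dict decision tree, or a class label at a leaf."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:      # all examples agree: leaf
        return labels[0]
    if not attributes:             # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({ex[best] for ex in examples}):
        subset = [ex for ex in examples if ex[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree

def classify(tree, example):
    """Follow attribute tests down the tree until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))
        tree = tree[attribute][example[attribute]]
    return tree

# Hypothetical toy examples: the label depends only on perspective (P).
toy_examples = [
    {"P": "F",  "H": "L", "label": "class_a"},
    {"P": "F",  "H": "R", "label": "class_a"},
    {"P": "NF", "H": "L", "label": "class_b"},
    {"P": "NF", "H": "R", "label": "class_b"},
]
tree = id3(toy_examples, ["P", "H"], "label")
```

Because the whole training set is examined before each split is chosen, the resulting tree does not depend on the order in which the examples are listed, which is the nonincremental property at issue here.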

Another  type  of  approach  would  be  the  incremental  conceptual 
clustering  proposed  in  Fisher's  COBWEB  program  (1987).  This  approach 
would tend to allow the model to operate in real-world and
contextual environments in which observations often arrive incrementally and can
affect the results of classification. This is in contrast to ID3, which is not
responsive to order of acquisition. Depending upon the circumstances
specified for operation of the model, this may be useful. One may desire
the incremental progression of classification but may not want to be
tied down by order of presentation. Yet another advantage of the COBWEB
work is the extent of prediction or inference of unseen object properties.
Once again, this mechanism would be analogous to the
experimental task in face recognition under the ecological conditions of
everyday encounters. The advantage may be specified as relating to the
basic level in human classification systems. The basic level lies between
the overly general and the restrictively specific classes which a human can
access. Mervis & Rosch (1981) indicate that these basic-level classes are
retrieved more quickly than the other classes and are hypothesized to be
where inference-related abilities are maximized in humans. The question
for this model is whether this is true for face recognition. A key point in
COBWEB and basic-level classes is the reliance on probabilistic representation of
attribute values. This supplies a representation that reflects
incrementation and acts to place an object in a given class. In essence,
probability  replaces  logical  operators  in  a  tree.  This  type  of  representation 
may  yield  an  advantage  for  some  of  the  attributes  in  the  model. 
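To make the probabilistic representation concrete, the sketch below computes category utility, the evaluation measure COBWEB uses to score a partition of objects from attribute-value probabilities. The attribute names and the four toy objects are hypothetical; this illustrates the measure only and is not part of the ID3-CPO implementation.

```python
from collections import Counter

def attribute_value_score(objects):
    """Sum over attributes and values of P(attribute = value) squared."""
    n = len(objects)
    score = 0.0
    for attribute in objects[0]:
        counts = Counter(obj[attribute] for obj in objects)
        score += sum((c / n) ** 2 for c in counts.values())
    return score

def category_utility(partition):
    """Category utility of a partition (a list of clusters of objects).

    For each cluster, the gain in expected correct attribute-value guesses
    over the unpartitioned baseline is weighted by the cluster's relative
    size; the total is divided by the number of clusters.
    """
    everything = [obj for cluster in partition for obj in cluster]
    baseline = attribute_value_score(everything)
    total = len(everything)
    gain = sum(len(cluster) / total * (attribute_value_score(cluster) - baseline)
               for cluster in partition)
    return gain / len(partition)

# Hypothetical toy objects described by perspective (P) and hemisphere (H).
a = {"P": "F", "H": "L"}
b = {"P": "F", "H": "L"}
c = {"P": "NF", "H": "R"}
d = {"P": "NF", "H": "R"}

coherent = category_utility([[a, b], [c, d]])   # clusters share attribute values
mixed = category_utility([[a, c], [b, d]])      # clusters are no better than chance
```

A partition whose clusters make attribute values more predictable scores higher, which is the sense in which probability takes over the role of logical tests in the tree.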

An important point to make is that the ID3 algorithm may be more
suitable for replicating data already summarized for an experimental study,
whereas the COBWEB algorithm would be more useful for a model of the
real-world perception of faces, where probability of classification is fully
developed and oftentimes automatic. Note here that an observation is
being made that the data taken from the experiment may not be
generalizable to the real world itself. One of the key factors in this
dilemma is the continuum of face recognition development, which spans
from the unfamiliar to the highly recognizable. This has major
implications for the attention facet described previously.

One  way  to  impart  generalizability  with  specificity  may  be  to  create  a 
model  with  both  logical  and  probabilistic  representations.  This  would  be 
useful for summary as well as noisy context situations. Schlimmer's (1987)
STAGGER  system  may  provide  the  foundation  for  just  such  an  approach. 
STAGGER  allows  concept  acquisitions  from  conjunctive,  disjunctive,  and 
negated  descriptions  from  noisy,  discrete,  and  real-valued  attribute 
representations.  The  system  explores  the  useful  idea  of  adapting 
representations  to  accommodate  types  of  concepts  to  be  classified.  It  is  in 
this  spirit  that  the  CEREBRAL  WEEVIL  model  is  launched,  although  within 
the  frugal  confines  of  an  ID3-CPO.  The  use  of  rules  of  engagement  to 
change  the  way  the  system  classifies  objects  via  the  ID3  algorithm  is 
attempted.  Rules  of  engagement  are  actually  conditions  associated  with 
certain  classifications.  They  are  presented  as  means  of  adapting 
performance  as  a  function  of  specific  values  of  attributes  which  formulate  a 
given  classification.  This  is  a  kind  of  adaptive  hemispheric  learning 
procedure.  Within  the  ID3-CPO  version,  these  rules  merely  provide  advice 
on  how  the  system  should  change,  whereas  in  a  version  that  can  readily 
access  working  memory,  these  rules  would  actually  alter  values  in  the  state 
of  the  system. 

One  problem  anticipated  is  the  transition  required  between 
nonincremental  and  incremental  absorption  of  examples.  Theoretically,  the 
system needs to assess states of hemispheric processing at strategic
criterion levels as well as individual example levels. The rationale here is
that  the  individual  examples  are  the  dynamics  which  constantly  feed  into 
the formation of general strategies. Thus, one view must be taken at the
strategic level across all available examples (nonincremental), and another
must access each example by itself (incremental). This paper only
looks  at  the  implementation  of  the  strategic  state,  but  in  order  to  develop  a 
complex  representation  the  incremental  state  would  also  be  necessary.  An 
attempt  will  be  made  to  remedy  this  by  using  rules  of  engagement  that  try 
to  contextualize  incrementation  and  probabilistic  notions.  In  the  concluding 
remarks  section,  discussion  will  again  focus  on  integrating  incremental 
with  nonincremental  methods. 

As  elaborated,  another  principle  or  mechanism  which  surfaced  in  the 
experimental  data  was  the  supplication  of  attentional  resources  with  the 
consequent  development  of  automatic  processing.  This  is  conceptually 
similar  to  the  mechanism  of  "chunking"  in  the  SOAR  program  (Laird, 
Rosenbloom,  &  Newell,  1986).  Chunking  is  derived  from  the  idea  that 
performance improves via practice and that a series of subgoals performed
initially  may  subsequently  be  replaced  with  learned  chunks. 
Concomitantly,  the  amount  of  effort  to  process  a  chunk  is  substantially  less 
than  that  to  process  each  subgoal  in  sequence.  Chunking  nests  well  with 
two  facets  of  hemispheric  recognition.  First,  it  relates  directly  to  a  person 
developing  prototypes  to  use  in  recognition  which  seems  to  be  based  on  a 
person's  familiarity  with  an  item.  Second,  the  entire  development  cycle  of 
"chunking  faces"  dynamically  occurs  through  switching  between  the  left 
and  right  cerebral  hemisphere  while  exhausting  various  numbers  of 
resources. Once fluency completely develops for a face, the amount of
processing resources used is significantly reduced. This is also mentioned
in the cognitive attention literature (Shiffrin & Schneider, 1977). Chunking
may  very  well  be  fortuitous  for  a  generalized  system  for  scale-up; 
whereby,  generalized  recognition  looms  as  the  necessary  and  sufficient 
condition  for  successful  performance.  Other  programs  which  create  a 
mechanism such as SOAR may be tempting to use. Korf's (1985) use of
macro-operators and Anderson's work (1986) with knowledge composition
in ACT*STAR are supportive of this direction, but they tend not to tie in as
well  with  the  overall  model  as  SOAR  does. 
THE DYNAMICS OF THE CEREBRAL WEEVIL SYSTEM


Because  of  the  similarity  of  the  experimental  procedure  and  the 
machine  learning  program,  it  will  be  instructive  to  review  the  hemispheric 
face recognition task. Likewise, a review of the independent variables
manipulated in the experiment provides a partial basis for determining the
attribute-value parameters for the model. Finally, the results of the
experiment provide the basis for hemispheric dynamics which will
subsequently provide the rules of engagement for the model as well as the
remainder of the attribute-value parameters. The goal of this part of the
paper is to outline the task, inputs, outputs, and dynamics, in conjunction
with a high-level description of the algorithm implemented.

CEREBRAL  WEEVIL  is  implemented  through  the  MacSMARTS™  (1987) 
knowledge  system  environment  on  a  Macintosh  Plus  computer.  The 
following discussion is closely aligned with the tools and facilities of this
environment.  For  further  information  on  using  this  environment  to  create 
the  WEEVIL,  please  refer  to  the  user  manual  described  in  Appendix  A. 
Note  that  MacSMARTS™  embeds  the  ID3  algorithm  within  an  expert  system 
shell,  wherein  examples  or  rules  may  be  used  to  compose  knowledge  bases. 

A  Database  of  Patterns  with  Specific  Attributes 

Within the actual experiments on which this work is predicated, subjects
were presented 288 face recognition trials. The design
consisted of the independent variables of Viewing Perspective (front, 3/4,
or side view) crossed with Hemispheric Access (tachistoscopically presented
stimuli directed to the right or left cerebral hemisphere) crossed with Trial Block
Familiarity (1, 2, 3, 4), hence creating a 3 x 2 x 4 experimental design. Each
trial block consisted of 3 x 2 possible conditions, wherein each condition
was repeated 12 times. Each trial block thus consisted of 6 x 12 or 72
trials. These are the experimental conditions that
yielded the results which are to be modeled. For reasons of parsimony, the 288
trials will not all be used. The model will initially use a universe of just 35
example conditions.
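The trial arithmetic of the design can be checked with a short enumeration. This is only a sketch; the variable names are illustrative, and the ordering of trials within a block is not specified by the text.

```python
from itertools import product

perspectives = ["front", "three-quarter", "side"]
hemispheres = ["left", "right"]
trial_blocks = [1, 2, 3, 4]
repetitions = 12

# The 3 x 2 conditions presented within every trial block.
conditions_per_block = list(product(perspectives, hemispheres))

# Every trial: (block, perspective, hemisphere, repetition number).
trials = [(block, p, h, rep)
          for block in trial_blocks
          for (p, h) in conditions_per_block
          for rep in range(1, repetitions + 1)]

trials_per_block = len(conditions_per_block) * repetitions
```

Here `trials_per_block` is 72 and `len(trials)` is 288, matching the counts given for the experiment.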

The specific attributes used to characterize a given face
are as follows:

Internal Stimulus attributes

1.) Perspective of Transformation (P)

2.) Familiarization (F) (exposure frequency of a given face across
trial blocks), and

External Stimulus context variables

3.) Attentional Resources (R) required to process a face per
hemisphere

4.) Hemisphere (H) to which a face is initially presented.

Thus,  for  each  face  there  are  a  total  of  4  attributes.  The  model  will  begin 
with  certain  built-in  resource  capabilities  but  could  be  dynamically 
updated  by  the  rules  in  conjunction  with  the  type  of  face  processed 
(assuming  access  to  working  memory  is  available).  One  of  the  interesting 
facets  could  be  whether  the  categorization  from  ID3  can  be  used  to 
subsequently reduce resources required to process the face. Table 1 shows
how the combination of attribute-value pairs changes across example faces.


Table 1

Attribute Values Per Object Face

                    ATTRIBUTE VALUES

                    Resources Expended per Task Demands
Face    (P-F-H)     SINGLE     DUAL       NO TRANSFER

1-3     F-1-L       LO         MED        MED-HI
4-6     NF-1-L      MED        MED-HI     HI
7-8     F-2-L       LO         MED        MED
9-10    NF-2-L      MED        HI         HI
11-12   F-3-L       LO         LO         MED
13-14   NF-3-L      MED-HI     HI         HI
15-16   F-4-L       LO         MED        MED
17-19   NF-4-L      MED        MED-HI     HI
20-21   F-1-R       LO         MED        MED
22-23   NF-1-R      MED-HI     HI         HI
24-25   F-2-R       LO         MED        MED
26-27   NF-2-R      LO         MED        MED
28-29   F-3-R       LO         MED        MED
30-31   NF-3-R      MED        MED-HI     MED-HI
32-33   F-4-R       LO         MED        MED
34-35   NF-4-R      MED-HI     HI         HI

THUS PARTIAL CROSSING OF RESOURCES WITH OTHER ATTRIBUTE
VALUES YIELDS A TOTAL OF 35 EXAMPLE FACES FOR 1 REPETITION.

P = Perspective, F = Familiarity, H = Hemisphere accessed,
NF = Non-frontal, F = Frontal,
1 = Unfamiliarity, 2 = Low familiarity, 3 = Medium familiarity, 4 = High familiarity,
L = Left hemisphere, R = Right hemisphere
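One way to see where the total of 35 comes from is to encode Table 1 and count one example face per distinct resource level within each P-F-H combination. That reading is an assumption made for this sketch; it is inferred from the face-number ranges in the table rather than stated in the text, and the dictionary layout and names are illustrative.

```python
# Table 1: (perspective, familiarity, hemisphere) -> resource levels for
# the SINGLE, DUAL, and NO TRANSFER columns.
TABLE_1 = {
    ("F", 1, "L"):  ("LO", "MED", "MED-HI"),
    ("NF", 1, "L"): ("MED", "MED-HI", "HI"),
    ("F", 2, "L"):  ("LO", "MED", "MED"),
    ("NF", 2, "L"): ("MED", "HI", "HI"),
    ("F", 3, "L"):  ("LO", "LO", "MED"),
    ("NF", 3, "L"): ("MED-HI", "HI", "HI"),
    ("F", 4, "L"):  ("LO", "MED", "MED"),
    ("NF", 4, "L"): ("MED", "MED-HI", "HI"),
    ("F", 1, "R"):  ("LO", "MED", "MED"),
    ("NF", 1, "R"): ("MED-HI", "HI", "HI"),
    ("F", 2, "R"):  ("LO", "MED", "MED"),
    ("NF", 2, "R"): ("LO", "MED", "MED"),
    ("F", 3, "R"):  ("LO", "MED", "MED"),
    ("NF", 3, "R"): ("MED", "MED-HI", "MED-HI"),
    ("F", 4, "R"):  ("LO", "MED", "MED"),
    ("NF", 4, "R"): ("MED-HI", "HI", "HI"),
}

# Assumption: each distinct resource level in a row is one example face.
examples = [
    {"P": p, "F": f, "H": h, "R": r}
    for (p, f, h), levels in TABLE_1.items()
    for r in dict.fromkeys(levels)   # distinct levels, original order kept
]

faces_in_first_row = sum(
    1 for ex in examples if (ex["P"], ex["F"], ex["H"]) == ("F", 1, "L")
)
```

Under this reading the per-row counts agree with the face-number ranges (for instance, three faces for F-1-L and two for F-2-L), and `len(examples)` is 35.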
Note that we have just used the nominal values of low, medium,
medium-high,  and  high  for  specifying  attentional  resource  levels  in  a  given 
example  (and  the  engagement  rules)  and  that  we  have  reduced  perspective 
down  into  two  values  (i.e.,  frontal  versus  nonfrontal  faces).  The  resource 
values  are  representative  of  single  and  dual  task  conditions  (face 
recognition plus semantic recognition simultaneously processed). Note,
however, that not every value of the resource attribute will be fully
crossed with the other 3 attributes. This is due to the fact that the same
resource expenditure may occur with different attribute valuations. The
prescription  of  resources  that  each  face  exhausts  from  a  hemispheric  pool 
was  taken  from  actual  subject  difficulties  encountered  with  each  face  in 
conjunction  with  other  data  about  hemispheric  processing. 

One of the dynamics of processing is for a comparison to be made
between the current level of resources specified in the hemisphere
accessed and the amount of resources that the current task composition
depletes. This may be accomplished by defining an initial level of
resources and incrementing that level as a function of the unique order of
faces presented to it. This makes the dynamic dependent upon order, but
this is also true in everyday activities. If a category becomes automatized, then
prior to depletion, a disengagement would occur, as fluent processing does
not require attention. Note too that the incrementation would subtract the
specified amount of resource from a constant increase in hemispheric
resource each time an example single or dual task is processed. This
simulates the notion of resources rebounding from exhaustion. If there were
no recovery of attentional strength, then there would never be enough
resources to accomplish tasks in either hemisphere. What is crucial is the
threshold level at which hemispheric transfer is necessary to continue.
Such dynamism is not possible in ID3-CPO, yet it is proposed for modules
that connect classification as operatives that, in conjunction with rules of
engagement, perturb actual quantitative and qualitative performance
levels. These more complex notions would rely upon feedforward and
feedback mechanisms as well.
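The resource dynamic described above can be sketched as a running pool that gains a constant recovery and loses the demand of each task, with a threshold below which hemispheric transfer would be needed. All numeric values here (the cost assigned to each nominal level, the initial level, the recovery amount, and the threshold) are hypothetical placeholders chosen for illustration; the text specifies none of them, and this module is proposed rather than implemented in ID3-CPO.

```python
# Hypothetical numeric costs for the nominal resource levels.
COST = {"LO": 1, "MED": 2, "MED-HI": 3, "HI": 4}

def run_sequence(demands, initial=6, recovery=1, threshold=2):
    """Track one hemisphere's resource pool over an ordered list of demands.

    Each task adds a constant `recovery` and subtracts the task's cost.
    Returns a list of (level_after_task, transfer_needed) pairs, where
    transfer_needed is True once the level falls below `threshold`.
    """
    level = initial
    history = []
    for demand in demands:
        level = level + recovery - COST[demand]
        history.append((level, level < threshold))
    return history

easy_first = run_sequence(["LO", "HI", "HI"])
hard_first = run_sequence(["HI", "HI", "LO"])
```

With these placeholder values both orders end at the same level, but the hard-first order crosses the threshold one task earlier, which illustrates why the dynamic depends on the order of presentation.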

Rules  of  Engagement 

The  objects  (faces)  will  be  subjected  to  problem  solving  rules  of 
engagement  which  suggest  what  action  the  accessed  hemisphere  must  take. 
This  action  has  the  implication  of  performance  levels  for  single  and  dual 
task conditions, and the amount of resources expended. Each time a face is
presented  to  the  rules,  certain  levels  of  problem  solving  ensue  with  certain 
performance  results  and  certain  resource  depletions.  After  each  training 
set  pass,  ID3  categorizes  these  faces  based  on  certain  attributes  which  are 
learned.  The  MacSMARTS™  environment  allows  these  rules  to  be  associated 
directly  with  the  attribute-values  of  every  example  face.  Therein,  the 
program  learns  how  to  apply  rules  to  certain  classes  of  examples, 
dependent  on  the  induction  of  decision  trees  used  for  a  given  training  set. 
Appendix B provides examples of such rules.
AN EVALUATION OF CEREBRAL WEEVIL


The  demonstration  of  the  algorithm  for  CEREBRAL  WEEVIL  may  be 
viewed as a systematic progression of stages, each of which is evaluated as
a separate module. In this way, comparisons can be made with ongoing and
past  experiments  to  gain  maximum  utility  from  the  model.  With  each 
stage,  it  is  anticipated  that  a  more  substantial  algorithm  and  more  highly 
evolved  representational  language  may  be  necessary.  Hence,  each  modular 
addition  to  the  overall  model  can  be  tested  to  ascertain  whether  any 
advantages  accrue,  and  whether  there  is  a  tradeoff  with  other  components 
and  performance  of  the  model.  The  intent  of  this  evaluation  is  to  observe 
the  performance  of  the  ID3-CPO  component  for  classifying  faces. 

Once  the  ID3-CPO  module  was  implemented,  an  empirical  test  was 
conducted  to  demonstrate  performance  results  in  terms  of  the  learning  set. 
Specifically, the independent variable for this experiment was the extent of
training  objects  obtained  from  a  universal  sample  of  all  possible  objects. 
This  may  be  operationally  defined  as  the  percentage  of  training  objects 
used  as  a  sample.  There  were  four  levels  of  the  independent  variable:  25% 
test  set,  50%  test  set,  75%  test  set,  and  100%  test  set  (which  represented  the 
universe).  The  dependent  variable  used  to  assess  performance  was 
accuracy.  The  percentage  of  correct  rules  of  engagement  (provided  in 
association  with  classification  parameters)  for  each  level  of  the  independent 
variable  represents  accuracy  in  this  experiment.  Other  measures  such  as 
cost were not assessed at this point due to the relative ease of implementing
this module and the negligible drain on memory given the sample size of
each set tested (i.e., 9, 18, 26, and 36 objects, respectively). However, cost
will  be  a  consideration  as  more  modules  are  implemented. 

Figure  1  graphically  provides  results  of  the  experiment.  As  shown, 
the  relative  success  of  inferring  engagement  rules  does  not  occur  until  the 
75%  test  set  and  this  is  just  slightly  over  50%  correct.  Hence,  the  system 
does  not  begin  to  have  high  reliability  of  inferencing  until  the  85-90% 
range.  In  part,  these  results  demonstrate  the  need  for  further 
development  in  the  areas  of  incrementation  and  probabilistic  knowledge 
representation  to  address  noise  in  the  samples.  The  construction  of 
additional  modules  is  hypothesized  to  produce  more  directed  inferencing 
such that high reliability would begin in the 50-60% range. This first test
provides a benchmark against which to judge successive implementations.


One reason the inference ability does not appear until the 85-90% range
may be the high interdependency that is established by the resource
allocation factor, as it is not completely crossed with the other factors.


Figure 1. Inference performance over % training set (x-axis: % of universe sample used for training)

Langley  (1987)  suggests  that  another  way  to  evaluate  the  system  is  to 
compare it with human learning. In many ways human learning is the
raison d'être of CEREBRAL WEEVIL. Indeed, the engagement rules, the
objects, and the attribute-value pairs were either
directly taken or derived from experimental results in hemispheric
processing. In fact, the attributes of familiarity and resource availability
directly track with the perceptual learning of faces. In the ID3-CPO module,
an attempt has been made to simulate human learning of faces as the faces
become more familiar. However, the learning is perturbed by the
complex problem of allocating resources as "sustenance" to continue to
propel the learning. Because ID3-CPO, as currently implemented, does not
draw on values in working memory, these two human learning aspects
must  be  simulated.  This  reveals  an  interesting  facet  of  machine  learning 
programs. 
On the one hand, inference ability is desired as an efficacious and
smooth transitional process for performance-based systems. This can be
contrasted with human learning systems (as exemplified by empirical
approaches), which often show signs of messiness, bias, and "climax
learning". In many circumstances the performance shown in Figure 1 is
similar to the actual trial-to-trial performance of subjects. The learning
strategies which subjects employ tend to be either (1) immediate
recognition or (2) prototype formation, which requires many examples
before climax recognition occurs. The performance of the ID3-CPO module
is  very  much  representative  of  prototype  strategy  formation.  It  also 
demonstrates  the  nonincremental  nature  that  results  after  many 
extractions.  In  conclusion,  the  evaluation  suggests  that  CEREBRAL  WEEVIL 
has  partially  modeled  human  performance  in  a  pattern  recognition  task. 
Yet  there  is  room  for  improvement,  both  on  empirical  and  performance 
levels.  The  following  section  briefly  reaffirms  the  commitment  to  develop 
additional  modules  to  address  these  problems. 
CONCLUDING REMARKS


One  of  the  hazards  of  using  the  ID3  framework  is  the  necessity  for  a 
voluminous  amount  of  training  examples  to  approach  correct 
generalizations. As Porter & Kibler (1986) aptly state, the volume of
training examples does not always provide correct generalizations.
Training may even be described as fortuitous. Stated in terms appropriate
to hemispheric processing, this suggests that if a face is representative of
the currently constructed prototype, then the face has a high probability of
recognition. However, if the face deviates from the prototype, such that a
new prototype or another strategy may be required, then this instantiation
acts to discard the current best strategy. However, the brain adjusts
very flexibly to assimilate the face via another method. CEREBRAL WEEVIL
currently can emulate human learning on the first step. What remains is to
evolve the system to regulate its performance. Such regulation requires at
least  two  new  elements  and  probably  more.  First,  the  WEEVIL  must  have 
other  learning  powers  upon  which  it  can  bias  itself.  Here,  bias  is  used  in 
the  sense  that  the  system  biases  itself  towards  another  learning  power 
upon  experiencing  dead-ends  in  the  current  power.  This  is  similar  to 
notions  expressed  by  Utgoff  (1986).  This  means  building  up  extensions  of 
the  ID3-CPO  or  new  interactive  modules  that  procure  new  representational 
and  procedural  biases.  Specifically,  the  areas  of  incremental  clustering, 
probabilistic  representation,  and  incremental  adjustment  are  all  good 
candidates  for  future  work.  The  point  to  be  remembered  is  that  the  system 
must  remain  integrated  to  form  a  machine  learning  collage  that  regulates 
the  method  according  to  the  performance  sampled. 

One  proposal  for  such  a  system  would  connect  nonincremental  and 
incremental  methods  together  to  transition  between  either  strategy,  based 
upon  the  appropriate  cognitive  strategy.  These  two  algorithms  may 
proceed  in  parallel,  but  the  key  is  that  they  are  mutually  deterministic.  An 
integration  may  proceed  by  creating  a  version-like  search  space  (see 
Mitchell,  1982)  of  possibilities  between  incremental  and  nonincremental 
solutions.  The  convergence  on  the  proper  state  could  be  based  on  an 
evaluation  metric  that  compares  the  extent  of  divergence  for  each 
incrementation with the current best generalized strategy tree. This
metric  could  be  averaged  for  each  case  such  that  initially  a  "best  fit"  could 
occur  at  a  state  half  way  between  the  incremental  and  nonincremental 
classification.  With  successive  training  sets,  one  would  observe  how  the 
system learns to learn by using a version search strategy to learn which is
the  best  learning  algorithm  to  use.  This  would  be  important  as  well  for  the 
cognitive  model  as  it  would  represent  how  the  brain  adapts  from  one 
strategy  to  another  as  a  function  of  developing  familiarity  with  attributes. 
In  fact,  interesting  tradeoffs  could  be  observed  between  incremental, 
developmental,  and  automatic  strategies  in  learning  faces  to  be  recognized. 
Once  the  system  settled  down,  a  specific  probabilistic  representation  could 
be  designated  for  preference  between  strategies  of  a  given  training  set. 
The  generalized  nonincremental  strategy  could  be  changed  by  a  teacher  or 
the  system  itself  could  act  as  the  teacher  in  a  way  akin  to  the  SAGE  system 
(Langley, 1985).  Hence,  a  statistical  equilibrium  between  machine  learning 
strategies  becomes  contingent  on  learning  search  heuristics.  Inherently, 
the  incrementation-nonincrementation  continuum  always  reflects  the 
current learning of the organism. Certainly, this is analogous to a human
learning by adaptively using both sides of the brain.
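
One way to picture the incremental-nonincremental continuum is as a single preference value that starts half way between the two strategies and drifts toward whichever performs better on each successive training set. The sketch below is only an illustration under stated assumptions, not the version-space search proposed above: the function names, the disagreement-rate definition of divergence, and the update rate are all hypothetical.

```python
from typing import Sequence


def divergence(incremental: Sequence[str], tree: Sequence[str]) -> float:
    """Fraction of cases on which the incremental classification disagrees
    with the current best nonincremental strategy tree."""
    if len(incremental) != len(tree):
        raise ValueError("both classifications must cover the same cases")
    if not incremental:
        return 0.0
    disagreements = sum(a != b for a, b in zip(incremental, tree))
    return disagreements / len(incremental)


def update_preference(preference: float, incremental_accuracy: float,
                      tree_accuracy: float, rate: float = 0.2) -> float:
    """Move the preference along the continuum after one training set.

    0.0 means fully nonincremental, 1.0 means fully incremental, and 0.5
    is the initial half-way "best fit".
    """
    if incremental_accuracy > tree_accuracy:
        target = 1.0
    elif incremental_accuracy < tree_accuracy:
        target = 0.0
    else:
        return preference  # no evidence either way; stay where we are
    return preference + rate * (target - preference)
```

Starting from 0.5 and calling `update_preference` once per training set gives a value that settles toward one end of the continuum when one strategy is consistently better; a settled value could then serve as the probabilistic preference between strategies mentioned above.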

The  second  need  revolves  around  creating  and  accessing  working 
memory  knowledge  that  can  portray  variables  as  continuous,  discrete, 
nominal,  or  numeric-probabilistic  states.  By  directly  linking  the  modules 
through  a  common  working  memory,  traces  can  be  used  to  direct  search, 
and  values  can  be  immediately  percolated  through  the  system. 
Consequently,  the  system  can  be  self-regulative  across  many  variations. 
This allows greater dynamism and also allows greater flexibility in
creating  rules  of  engagement.  The  ability  to  draw  on  knowledge  in  working 
memory  allows  sensitivity  to  knowledge  that  spawns  greater 
generalizations  and  thus  creates  a  much  more  robust  system. 
Psychologically,  it  also  allows  development  of  cognitive  models  for 
determination  of  when  and  how  inert  knowledge  occurs. 
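
The common working memory described above can be sketched as a small shared store that logs every write as a trace and passes each new value on to the modules listening for it. This is a minimal sketch under assumptions: the class, its method names, and the module names used in the usage note are hypothetical and are not taken from CEREBRAL WEEVIL or MacSMARTS.

```python
from typing import Any, Callable, Dict, List, Tuple


class WorkingMemory:
    """Hypothetical shared store linking learning modules.

    Values may be continuous (float), discrete (int), nominal (str), or
    numeric-probabilistic (e.g. a dict mapping states to probabilities).
    """

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._listeners: List[Callable[[str, Any], None]] = []
        # Each trace entry records (writing module, variable name, value).
        self.trace: List[Tuple[str, str, Any]] = []

    def subscribe(self, listener: Callable[[str, Any], None]) -> None:
        """Register a module callback that is told about every new value."""
        self._listeners.append(listener)

    def write(self, source: str, name: str, value: Any) -> None:
        """Store a value, log the trace, and percolate it to all listeners."""
        self._values[name] = value
        self.trace.append((source, name, value))
        for listener in self._listeners:
            listener(name, value)

    def read(self, name: str, default: Any = None) -> Any:
        return self._values.get(name, default)
```

For example, a tree-building module might call `write("id3", "familiarity", "LOW")` while a clustering module writes `{"LEFT": 0.3, "RIGHT": 0.7}` under another name; the `trace` list then gives a record that a search procedure could consult.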

In  conclusion,  it  has  been  the  author's  hope  to  develop  a  machine 
learning  model  of  hemispheric  cognition,  within  the  context  of  discovering 
issues  related  to  implementing  learning  mechanisms  as  well  as 
understanding  psychological  reality.  The  approach  taken  might  be 
classified  as  "empiricity".  Learning  mechanisms  are  modeled  based  on  an 
empirical  analysis  of  psychological  reality  and  are  subsequently  tested 
empirically for their demonstration of that reality. Any differences
that emerge are considered in terms of changes to both future experimental
studies of humans and new developments in the programs used to model
the reality. This approach has been successful in this first model as it
provides an evolutionary path of hemispheric cognition that signals where
the psychological research and the machine learning model have been,
where they currently stand, and where they must go, as well as forecasting
the means to get there. Finally, this project has demonstrated the
necessary interdependence between mechanisms of human and machine
learning, which can serve as the epicenter for future cognitive science
efforts.
APPENDIX A


A  User's  Guide  into  CEREBRAL  WEEVIL 
User  as  Cognitive  Psychologist 

First,  this  guide  will  seek  to  introduce  an  understanding  of  how  to  use 
CEREBRAL  WEEVIL.  Much  of  the  information  for  this  appendix  will  be 
drawn  from  the  MacSMARTS™  user  manual  (1987),  but  placed  in  the 
context  of  using  and  creating  the  WEEVIL.  Before  describing  how  a  user 
works with the system, it is desirable to describe the setting in which
the user is placed.

The  user  in  this  context  is  the  cognitive  psychologist.  The  user  desires 
to  present  the  homunculus  (WEEVIL)  with  different  possibilities  or 
combinations  of  the  independent  variables  to  see  how  the  system  classifies 
these examples. The system may respond in a way that indicates a certain
level of performance and/or suggests how adaptation for better
performance can occur (e.g., suspend operations in the current hemisphere and
switch to the opposite hemisphere for more efficient performance). Thus,
when a user assumes the role of psychologist, he/she is basically running
an experiment to see how WEEVIL responds to the example selected. The
variability  supplied  to  such  experiments  may  occur  by  manipulating  the 
independent  variables,  the  dependent  variables,  the  size  of  the  training  set, 
and  other  conditions  allowed  by  MacSMARTS™.  The  WEEVIL  actually  has 
two  modes  of  operation.  The  mode  used  when  the  user  assumes  the  role  of 
psychologist  is  termed  "rat  race".  It  implies  that  the  user  accepts  the 
default  values  currently  programmed  into  the  system  and  proceeds  by 
running  the  system  in  a  hypothesized  experiment.  If  the  user  wants  to  go 
beyond  what  the  current  system  knows,  he/she  assumes  the  role  of 
knowledge  engineer  and  enters  the  mode  termed  "surgeon".  This  mode 
allows  the  user  to  perform  surgery  upon  the  homunculus  to  create  a  new 
art  of  the  state.  The  role  of  the  user  as  knowledge  engineer  will  now  be 
described. 

User  as  Knowledge  Engineer 

The  user  may  assume  this  role  upon  sensing  that  the  system  does  not 
capture  necessary  conditions  to  be  tested  in  the  experimental  run.  The 
underlying function of WEEVIL is to test out predictions based on the
learning  derived  through  certain  training  sets  and  different  attribute-value 
pairings.  Hence,  if  one  wants  to  prescribe  changes  that  go  beyond  current 
valuations,  then  there  must  be  a  switch  into  the  surgery  mode.  In  order  to 
do  surgery,  some  basic  tools  of  the  operating  environment  must  be 
introduced. The tools that the user has available are a mouse and a user
interface consisting of a variety of windows, menus, buttons, and
spreadsheets for entering examples. For more specific information refer to
the  MacSMARTS™  manual.  At  this  point,  it  would  be  useful  to  take  a 
guided  tour  of  the  operating  room  and  define  the  underlying  form  of  the 
WEEVIL  upon  which  the  tools  of  surgery  must  be  applied. 

The first stop is the knowledge base. It forms the brain of the system.
To access the knowledge base a user must pull down an operating menu
entitled "example editor". The menu is accessed from a title at the top of
the screen termed "logic". There are three main components of the example
editor: factors, advice, and examples. One must select factors to
encode new independent variables. The user moves the mouse to the
"new" button and then types in the new variable(s) desired. If variables
are to be removed, the user just moves the mouse to highlight the variable
to be removed, then moves the cursor to the cancel button. After a new
variable is programmed, it will appear in the "factor window" on the left
side of the screen. On the right side of the screen the user must type in the
various conditions associated with the independent variable. They are
termed "choices" and are entered the same way as factors. Once entered
they appear in a "choices window" on the right side of the screen.

Note that at the top of the screen the user may supply complete
questions associated with each factor entered. This allows the system to
prompt the user with a question to supply the necessary parameters when
in the rat race mode. The next step is for the user to enter classes of advice
(rules of engagement) for the system. A user may cancel advice as well.
This is accomplished by the same mousing operations explained for
creating/cancelling factors. Once again, statements to elaborate the advice
may be typed in at the top of the screen. Once this step is completed, the
user progresses to example creation. A user highlights the new button to
create a new example or highlights the cancel button (once the cursor has
been moved to the example number shown in the example
spreadsheet) to remove the example. The spreadsheet columns show the
example number, the factors currently programmed in the system, and the
associated advice. The user must supply the respective parameters for
each factor and the advice attached to each example. This is the information
typed into the rows of the spreadsheet. Once a set of examples is
complete, the user clicks on "done".
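
The three components of the example editor (factors with their choices, classes of advice, and examples) map naturally onto a small data structure. The sketch below is an illustration only and does not reflect how MacSMARTS stores a knowledge base internally; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class KnowledgeBase:
    """Hypothetical stand-in for the contents of the example editor."""
    factors: Dict[str, List[str]] = field(default_factory=dict)  # factor -> choices
    advice: List[str] = field(default_factory=list)              # advice classes
    examples: List[Tuple[Dict[str, str], str]] = field(default_factory=list)

    def add_factor(self, name: str, choices: Sequence[str]) -> None:
        """Encode a new independent variable and its allowed choices."""
        self.factors[name] = list(choices)

    def add_advice(self, label: str) -> None:
        """Register a class of advice (a rule of engagement)."""
        if label not in self.advice:
            self.advice.append(label)

    def add_example(self, values: Dict[str, str], advice: str) -> None:
        """Add one spreadsheet row: a choice for every factor plus its advice."""
        if set(values) != set(self.factors):
            raise ValueError("an example must give one choice for every factor")
        for factor, choice in values.items():
            if choice not in self.factors[factor]:
                raise ValueError(f"{choice!r} is not a choice of factor {factor!r}")
        if advice not in self.advice:
            raise ValueError(f"unknown advice class {advice!r}")
        self.examples.append((dict(values), advice))
```

Validating each row against the declared factors and advice classes mirrors the order of the tour: factors and choices first, then advice, then examples.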

The next stop in the tour completes the surgery. It is now time to
invoke the ID3 component. To do this the user clicks on the logic title
and pulls down the menu, whereupon the selection entitled "create example
based rule" is highlighted. This lets ID3 operate on the training set
currently used. It responds with a logic worksheet that shows the general
relations between facts, rules, and advice. At this point the WEEVIL is in
the recovery room having just undergone surgery. If the user desires to
run the system, he/she must go to the logic title, pull down the menu,
and select "run advisor". This move now returns WEEVIL to the rat race
mode, whereupon it asks the successive questions associated with the new
parameters  programmed  into  the  system.  To  try  different  example  faces, 
the  user  merely  clicks  on  the  "rerun"  button  and  the  system  clears  itself  to 
run  the  next  example.  If  another  previously  created  (and  saved)  test  set  is 
desired,  the  user  clicks  on  the  "get  new  KB"  button  and  the  system  shows 
all  the  KBs  available.  The  user  highlights  the  one  required  and  continues  as 
previously  described. 
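
For readers who want to see what the ID3 step does with a training set, the following is a textbook-style sketch of Quinlan's algorithm: it chooses the factor with the highest information gain, splits the examples on it, and recurses. It is not the MacSMARTS implementation, whose internals are not described here; the function names and the example format (a dictionary of factor choices paired with an advice label) are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple, Union

Example = Tuple[Dict[str, str], str]          # (factor -> choice, advice)
Tree = Union[str, Tuple[str, Dict[str, "Tree"]]]  # leaf advice or (factor, branches)


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of advice labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples: List[Example], factor: str) -> float:
    """Reduction in entropy obtained by splitting the examples on one factor."""
    base = entropy([advice for _, advice in examples])
    remainder = 0.0
    for choice in {values[factor] for values, _ in examples}:
        subset = [advice for values, advice in examples if values[factor] == choice]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder


def id3(examples: List[Example], factors: List[str]) -> Tree:
    """Build a decision tree from a non-empty list of examples."""
    labels = [advice for _, advice in examples]
    if len(set(labels)) == 1:
        return labels[0]                              # pure node: one advice class
    if not factors:
        return Counter(labels).most_common(1)[0][0]   # fall back to majority advice
    best = max(factors, key=lambda f: information_gain(examples, f))
    rest = [f for f in factors if f != best]
    branches = {}
    for choice in {values[best] for values, _ in examples}:
        subset = [(v, a) for v, a in examples if v[best] == choice]
        branches[choice] = id3(subset, rest)
    return (best, branches)


def classify(tree: Tree, values: Dict[str, str]) -> str:
    """Follow the tree for one case, as the advisor does by asking questions.

    Raises KeyError if the case uses a choice never seen in training.
    """
    while isinstance(tree, tuple):
        factor, branches = tree
        tree = branches[values[factor]]
    return tree
```

Calling `id3(examples, factors)` corresponds loosely to "create example based rule", and `classify` corresponds to one pass of "run advisor", in which each factor on the path through the tree becomes a question to the user.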

This then concludes the user's guided tour of the operating
environment. Although brief, it encapsulates the basics of the system. If
additional information is desired, please contact the author or consult the
MacSMARTS™ user manual.


APPENDIX  B 


Examples  of  Rules  of  Engagement 


MacSMARTS.log

Date: Tuesday, November 28, 198[illegible]    Time: 11:40 AM

Reply: FRONTAL VIEW
Question: What hemisphere is the face presented to?
Reply: RIGHT HEMISPHERE
Question: What level of familiarity does the face possess?
Reply: LOW FAMILIARITY
Question: What is the level of resources expended?
Reply: LOW
Advice: HEMISPHERIC CONTROL: Do not transfer control to the other hemisphere; continue to process in hemisph...

Rerun

Question: What is the perspective of the face presented?
Reply: NONFRONTAL VIEW
Question: What hemisphere is the face presented to?
Reply: LEFT HEMISPHERE
Question: What level of familiarity does the face possess?
Reply: UNFAMILIAR
Question: What is the level of resources expended?
Reply: MEDIUM
Advice: HEMISPHERIC CONTROL: The system has adapted to produce efficient perf on the face task but less than...

Rerun

Question: What is the perspective of the face presented?
Reply: NONFRONTAL VIEW
Question: What hemisphere is the face presented to?
Reply: RIGHT HEMISPHERE
Question: What level of familiarity does the face possess?
Reply: UNFAMILIAR
Question: What is the level of resources expended?
Reply: HIGH
Advice: HEMISPHERIC CONTROL: The system has adapted to produce efficient perf on the face task but less than...

Question: What is the perspective of the face presented?
Reply: FRONTAL VIEW
Question: What hemisphere is the face presented to?
Reply: RIGHT HEMISPHERE
Question: What level of familiarity does the face possess?
Reply: HIGH FAMILIARITY
Question: What is the level of resources expended?
Reply: LOW
Advice: HEMISPHERIC CONTROL: Do not transfer control to the other hemisphere; continue to process in hemisph...

Rerun

Question: What is the perspective of the face presented?
Reply: NONFRONTAL VIEW
Question: What hemisphere is the face presented to?
Reply: LEFT HEMISPHERE
Question: What level of familiarity does the face possess?
Reply: HIGH FAMILIARITY
Question: What is the level of resources expended?
Reply: HIGH
Advice: HEMISPHERIC CONTROL: The system has adapted to produce poor performance on all tasks demanded.

Question: What is the perspective of the face presented?
Reply: NONFRONTAL VIEW
Question: What hemisphere is the face presented to?
Reply: LEFT HEMISPHERE
Question: What level of familiarity does the face possess?
Reply: LOW FAMILIARITY
Question: What is the level of resources expended?
Reply: MEDIUM
Advice: HEMISPHERIC CONTROL: Try to transfer from current hemisphere to other else suspend operation and...

Rerun

Question: What is the perspective of the face presented?
Reply: NONFRONTAL VIEW
Question: What hemisphere is the face presented to?
Reply: RIGHT HEMISPHERE
Question: What level of familiarity does the face possess?
Reply: HIGH FAMILIARITY
Question: What is the level of resources expended?
Reply: [illegible]
Advice: HEMISPHERIC CONTROL: The system has adapted to produce efficient perf on the face task but less than...